Semi-supervised training for bottle-neck feature based DNN-HMM hybrid systems

Authors

  • Haihua Xu
  • Hang Su
  • Chng Eng Siong
  • Haizhou Li
Abstract

In this paper, we investigate a semi-supervised training (SST) method for several state-of-the-art acoustic modeling techniques that use bottle-neck features and the corresponding tandem features. These techniques include the subspace GMM, the tanh-neuron deep neural network (DNN), and a generalized soft-maxout (p-norm) DNN. We demonstrate that SST may yield up to a 2% Word Error Rate (WER) reduction with each of these techniques, and that the best result comes from the tandem-feature-based p-norm DNN system. In addition to recognition performance, we investigate the effect of SST on keyword search performance. Results on Actual Term Weighted Value (ATWV) are reported, together with an analysis of lattice density. It is shown that SST may not necessarily increase ATWV, owing to the reduction in lattice size.


Similar resources

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...


BUT 2014 Babel system: analysis of adaptation in NN based systems

Features based on a hierarchy of neural networks with compressive layers – Stacked Bottle-Neck (SBN) features – were recently shown to provide excellent performance in LVCSR systems. This paper summarizes several techniques investigated in our work towards Babel 2014 evaluations: (1) using several versions of fundamental frequency (F0) estimates, (2) semi-supervised training on un-transcribed d...


Exploiting foreign resources for DNN-based ASR

Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance ...


Uncertainty decoding for DNN-HMM hybrid systems based on numerical sampling

In this article, we propose an uncertainty decoding scheme for DNN-HMM hybrid systems based on numerical sampling. A finite set of samples is drawn from the estimated probability distribution of the acoustic features and subsequently passed through feature transformations/extensions and the deep neural network (DNN). Then, the nonlinearly-transformed feature samples are averaged at the output o...


An improved uncertainty decoding scheme with weighted samples for DNN-HMM hybrid systems

In this paper, we advance a recently-proposed uncertainty decoding scheme for DNN-HMM (deep neural network hidden Markov model) hybrid systems. This numerical sampling concept averages DNN outputs produced by a finite set of feature samples (drawn from a probabilistic distortion model) to approximate the posterior likelihoods of the context-dependent HMM states. As main innovation, we propose a...




Publication date: 2014